Minimising semantic drift with Mutual Exclusion Bootstrapping

Authors

  • James R. Curran
  • Tara Murphy
  • Bernhard Scholz
Abstract

Iterative bootstrapping techniques are commonly used to extract lexical semantic resources from raw text. Their major weakness is that, without costly human intervention, the extracted terms drift (often rapidly) from the meaning of the original seed terms. In this paper we propose Mutual Exclusion Bootstrapping (MEB), in which multiple semantic classes compete for each extracted term. This significantly reduces the problem of semantic drift by providing boundaries for the semantic classes. We demonstrate the superiority of MEB to standard bootstrapping in extracting named entities from the Google Web 1T 5-grams. Finally, we demonstrate that MEB is a multi-way cut problem over semantic classes, terms and contexts.
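
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the competition idea, under stated assumptions: the input is a list of (term, context) pairs, candidates are scored by how many class-exclusive contexts they share, and any context or term claimed by more than one class is discarded for that round. The function name `meb_sketch`, its parameters, and the scoring rule are hypothetical and not taken from the paper.

```python
from collections import Counter, defaultdict


def meb_sketch(pairs, seeds, iterations=2, top_k=2):
    """Toy mutual-exclusion bootstrapping (illustrative, not the paper's exact method).

    pairs: iterable of (term, context) co-occurrences.
    seeds: dict mapping class name -> set of seed terms.
    """
    term_to_ctx = defaultdict(set)
    ctx_to_term = defaultdict(set)
    for term, ctx in pairs:
        term_to_ctx[term].add(ctx)
        ctx_to_term[ctx].add(term)

    lexicons = {cls: set(terms) for cls, terms in seeds.items()}
    for _ in range(iterations):
        # 1. Each class proposes the contexts its current terms occur in.
        proposed = {
            cls: {c for t in terms for c in term_to_ctx.get(t, ())}
            for cls, terms in lexicons.items()
        }
        # 2. Mutual exclusion on contexts: keep only contexts claimed by one class.
        ctx_claims = Counter(c for ctxs in proposed.values() for c in ctxs)
        exclusive = {
            cls: {c for c in ctxs if ctx_claims[c] == 1}
            for cls, ctxs in proposed.items()
        }
        # 3. Each class scores new candidate terms by exclusive-context count.
        known = set().union(*lexicons.values())
        scores = {cls: Counter() for cls in lexicons}
        for cls, ctxs in exclusive.items():
            for c in ctxs:
                for t in ctx_to_term[c]:
                    if t not in known:
                        scores[cls][t] += 1
        # 4. Mutual exclusion on terms: drop candidates claimed by several classes.
        term_claims = Counter(t for s in scores.values() for t in s)
        for cls, s in scores.items():
            ranked = sorted(
                (t for t in s if term_claims[t] == 1),
                key=lambda t: (-s[t], t),
            )
            lexicons[cls].update(ranked[:top_k])
    return lexicons
```

With a location class seeded by "sydney" and a person class seeded by "john", an ambiguous term such as "washington" that appears in contexts of both classes is claimed by both and therefore added to neither. That is the boundary effect the abstract attributes to competing classes.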

Similar resources

Weighted Mutual Exclusion Bootstrapping for Domain Independent Lexicon and Template Acquisition

We present the Weighted Mutual Exclusion Bootstrapping (WMEB) algorithm for simultaneously extracting precise semantic lexicons and templates for multiple categories. WMEB is capable of extracting larger lexicons with higher precision than previous techniques, successfully reducing semantic drift by incorporating new weighting functions and a cumulative template pool while still enforcing mutua...

Experiments in Mutual Exclusion Bootstrapping

Mutual Exclusion Bootstrapping (MEB) was designed to overcome the problem of semantic drift suffered by iterative bootstrapping, where the meaning of extracted terms quickly drifts from the original seed terms (Curran et al., 2007). MEB works by extracting mutually exclusive classes in parallel which constrain each other. In this paper we explore the strengths and limitations of MEB by applying...

Graph-based Analysis of Semantic Drift in Espresso-like Bootstrapping Algorithms

Bootstrapping has a tendency, called semantic drift, to select instances unrelated to the seed instances as the iteration proceeds. We demonstrate that the semantic drift of bootstrapping has the same root as the topic drift of Kleinberg’s HITS, using a simplified graph-based reformulation of bootstrapping. We confirm that two graph-based algorithms, the von Neumann kernels and the regularized Laplac...

Relation Guided Bootstrapping of Semantic Lexicons

State-of-the-art bootstrapping systems rely on expert-crafted semantic constraints such as negative categories to reduce semantic drift. Unfortunately, their use introduces a substantial amount of supervised knowledge. We present the Relation Guided Bootstrapping (RGB) algorithm, which simultaneously extracts lexicons and open relationships to guide lexicon growth and reduce semantic drift. Thi...

Learning Semantic Lexicons using Graph Mutual Reinforcement based Bootstrapping

Bootstrapping has received attention in many fields and achieved good results, and semantic lexicons have proved useful for many natural language processing tasks. This paper presents an approach to learning semantic lexicons using a new bootstrapping method based on Graph Mutual Reinforcement. The approach uses only unlabeled data and a few seed wor...

Publication date: 2007